Skip to content

Makes document attachment usage consider external attachments #1600

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

Spoffy
Copy link
Contributor

@Spoffy Spoffy commented May 5, 2025

Context

Documents' attachment usage currently only considers files which are stored internally.

If a document has external attachments, they won't show in the document's "Size of attachments" bar in "Raw Data".

Proposed solution

This updates the attachment usage calculations to use _grist_Attachments.fileSize for external attachments, while preserving the existing logic for internal attachments.

Missing attachments

If a document has external attachments and is uploaded to a Grist instance, these missing attachments will still count towards the document's attachment usage. There's no easy way to mitigate this in all situations without checking all attachments on document open.

Untrusted file size value

This creates a problem with imported documents, which may have an incorrect fileSize value if it has been manually altered (i.e - the value is untrusted).

To mitigate this, the fileSize value is now updated when attachments are uploaded as a .tar archive. This means that eventually, fileSize is always valid for files which are present in external storage.

Related issues

#1021

Has this been tested?

  • 👍 yes, I added tests to the test suite

@Spoffy Spoffy added Attachments bug Something isn't working and removed bug Something isn't working labels May 5, 2025
@Spoffy
Copy link
Contributor Author

Spoffy commented May 5, 2025

This is dependent on the UI PR being merged.

This needs to be rebased once that is merged.

@Spoffy Spoffy changed the base branch from main to spoffy/archive-ui May 5, 2025 22:20
@Spoffy Spoffy marked this pull request as ready for review May 5, 2025 23:28
@berhalak berhalak self-assigned this May 6, 2025
@Spoffy Spoffy requested a review from paulfitz May 6, 2025 15:30
Copy link
Member

@paulfitz paulfitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if we need some tests on just adding and removing attachments and checking that stats update correctly, when those attachments are external. I see some attachment accounting tests in test/nbrowser/DocumentUsage.ts in grist-saas. If those aren't entangled too much with billing would it be worth bringing them across? Or are there already tests?

const attachments = this.docData?.getMetaTable("_grist_Attachments").getRecords();
for (const attachmentRec of attachments ?? []) {
const newSize = newFileSizesByFileIdent.get(attachmentRec.fileIdent);
if (newSize) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about checking if the size is different from the old recorded size? I believe the data engine will prune changes where a value is written to a cell that already contains that value, but seems easy to cut it out at the source here.

@berhalak berhalak self-requested a review May 9, 2025 08:39
import {server} from 'test/nbrowser/testServer';
import {setupTestSuite} from 'test/nbrowser/testUtils';

describe('DocumentUsage', function() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this test over using DIFF? I see that you've changed the content of this file, and I don't have tools to show me the what was changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can tag which bits are the same? I've done a restructuring on this file to make testing with external attachments easier, so I'm not sure a diff would show many useful things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a rebase that should make this easier.

This commit has all the DocumentUsage changes in:
c90b777

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New commit hash, had to do a rebase to sort a merge issue:
aacb5ee

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you maybe add another file for testing only external attachments and leave this one untouched?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of these tests fail if untouched 😅

I can isolate and duplicate the attachments tests if you want, and ensure they pass under both external and internal.

But I'm also not seeing the harm in having a pure-core version of this file? Happy to discuss further.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussion:

  • Several of the non-attachments related tests will be moved to a new PR
  • This file will be renamed

async function makeSessionAndLogin() {
// login() needs an options object passing to bypass an optimization that causes .login()
// to think we're already logged in when we're not after using `server.restart()`.
// Without this we end up with bad credentials.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you end up with old credentials, not bad. You can be logged into grist using multiple accounts at the same time, this is a feature :).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done a slight rephrasing.

The original session ends up with old credentials, you're right! But any new session ends up with bad credentials - specifically the bearer token ends up as api_key_for_chimpy, which just gives us the wrong responses from the API 😅

@Spoffy Spoffy force-pushed the spoffy/attachment-size-calc branch from e0a0f53 to 871b9e1 Compare May 12, 2025 14:19
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide whitespace changes for this diff - it'll make it much smaller. A lot of it was just indentation changes.

@Spoffy Spoffy force-pushed the spoffy/attachment-size-calc branch from 6e4d36f to b2c0483 Compare May 12, 2025 14:33
Base automatically changed from spoffy/archive-ui to main May 12, 2025 14:42
@Spoffy Spoffy force-pushed the spoffy/attachment-size-calc branch from b2c0483 to bc5ae34 Compare May 12, 2025 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants